Reference-Based Speech Enhancement via Feature Alignment and Fusion Network

Abstract

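Speech enhancement aims at recovering clean speech from a noisy input, and can be classified into single speech enhancement and personalized speech enhancement. Personalized speech enhancement usually utilizes the speaker identity extracted from the noisy speech itself (or from a clean reference speech) as a global embedding to guide the enhancement process. Different from them, we observe that the speeches of the same speaker are correlated in terms of the frame-level short-time Fourier Transform (STFT) spectrogram. Therefore, we propose reference-based speech enhancement via a feature alignment and fusion network (FAF-Net). Given a noisy speech and a clean reference speech spoken by the same speaker, we first propose a feature-level alignment strategy to warp the clean reference with the noisy speech at the frame level. Then, we fuse the reference feature with the noisy feature via a similarity-based fusion strategy. Finally, the fused features are passed through skip connections to the decoder, which generates the enhanced results. Experimental results demonstrate that the performance of the proposed FAF-Net is close to state-of-the-art speech enhancement methods on both the DNS and Voice Bank+DEMAND datasets. Our code is available at https://github.com/HieDean/FAF-Net.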
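To make the align-then-fuse idea concrete, here is a minimal NumPy sketch. It is not the authors' FAF-Net implementation (see the linked repository for that). It assumes, purely for illustration, that alignment picks for each noisy frame the reference frame with the highest cosine similarity, and that fusion is a convex blend weighted by that similarity. The function name `align_and_fuse` and its parameters are hypothetical.

```python
import numpy as np

def align_and_fuse(noisy, ref, eps=1e-8):
    """Illustrative frame-level alignment and similarity-based fusion.

    noisy: (T_n, D) array of frame-level features of the noisy speech.
    ref:   (T_r, D) array of frame-level features of the clean reference.
    Returns the fused features (T_n, D), the chosen reference frame index
    per noisy frame (T_n,), and the fusion weights (T_n, 1).
    """
    # L2-normalise each frame so the dot product is a cosine similarity.
    n = noisy / (np.linalg.norm(noisy, axis=1, keepdims=True) + eps)
    r = ref / (np.linalg.norm(ref, axis=1, keepdims=True) + eps)
    sim = n @ r.T                                   # (T_n, T_r)

    # Alignment (assumed rule): warp the reference by picking, for every
    # noisy frame, its most similar reference frame.
    idx = sim.argmax(axis=1)
    aligned = ref[idx]                              # (T_n, D)

    # Fusion (assumed rule): trust the aligned reference in proportion
    # to how similar it is to the noisy frame.
    weight = np.clip(sim[np.arange(len(idx)), idx], 0.0, 1.0)[:, None]
    fused = (1.0 - weight) * noisy + weight * aligned
    return fused, idx, weight
```

In the paper's pipeline the fused features feed a decoder through skip connections. This sketch stops at the fusion step and uses raw feature vectors rather than learned encoder features.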

Similar resources

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...

Fuse: multiple network alignment via data fusion

Motivation: Discovering patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionarily conserved pathways, protein complexes and functional orthologs. However, the complexity of the multiple network alignment problem grows exponentially ...

Speech Enhancement via EMD

In this study, two new approaches for speech signal noise reduction are proposed, based on the empirical mode decomposition (EMD) recently introduced by Huang et al. (1998). Based on the EMD, both reduction schemes are fully data-driven approaches. The noisy signal is decomposed adaptively into oscillatory components called intrinsic mode functions (IMFs), using a temporal decomposition called sifti...

Subtractive Clustering Based Feature Enhancement for Isolated Malay Speech Recognition

This paper proposes a new hybrid method named SCFE-PNN, which integrates effective subtractive-clustering-based feature enhancement with a probabilistic neural network (PNN) classifier, for isolated Malay word recognition. The proposed method of subtractive clustering feature weighting is used as a data preprocessing tool, which aims at diminishing the divergence in featur...

Feature-based image alignment via coupled Hough transforms

We investigate the problem of feature-based image alignment under vertical and horizontal translation and scaling. The problem of finding the two-dimensional affine alignment without rotation can be separated into two coupled simpler problems. The resulting two model-fitting problems are coupled through a consensus set of points that fit both models. The proposed solution can be viewed as a c...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i10.21419